Robust, Accurate Stochastic Optimization for Variational Inference

Neural Information Processing Systems

We examine the accuracy of black box variational posterior approximations for parametric models in a probabilistic programming context. The performance of these approximations depends on (1) how well the variational family approximates the true posterior distribution, (2) the choice of divergence, and (3) the optimization of the variational objective. We show that even when the true posterior is a member of the variational family, high-dimensional posteriors can be very poorly approximated using common stochastic gradient descent (SGD) optimizers. Motivated by recent theory, we propose a simple and parallel way to improve SGD estimates for variational inference. The approach is theoretically motivated and comes with a diagnostic for convergence and a novel stopping rule, both of which are robust to noisy objective function evaluations. We show empirically that the new workflow works well on a diverse set of models and datasets, and warns when the stochastic optimization fails or when the variational distribution used is a poor approximation.
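As a rough illustration of the iterate-averaging idea, the sketch below runs plain SGD on a stochastic ELBO gradient and returns both the final iterate and the Polyak-Ruppert average of the post-warmup iterates. The function name `elbo_grad`, the fixed step size, and the warmup split are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def sgd_with_iterate_averaging(elbo_grad, init, lr=0.01, n_iters=5000, warmup=2500, seed=0):
    # Plain SGD ascent on a stochastic ELBO gradient.  `elbo_grad(params, rng)` is
    # assumed (hypothetical helper) to return an unbiased Monte Carlo estimate of the
    # ELBO gradient with respect to the variational parameters.
    rng = np.random.default_rng(seed)
    params = np.asarray(init, dtype=float).copy()
    running_sum = np.zeros_like(params)
    n_averaged = 0
    for t in range(n_iters):
        g = elbo_grad(params, rng)            # noisy gradient estimate
        params = params + lr * g              # gradient ascent step on the ELBO
        if t >= warmup:                       # average only the (roughly) stationary tail
            running_sum += params
            n_averaged += 1
    return params, running_sum / n_averaged   # last iterate vs. averaged iterate
```

With a constant step size, the last iterate keeps bouncing inside a noise ball around the optimum, while the average of the stationary-phase iterates concentrates much closer to it, which is the motivation for preferring the averaged estimate.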



Review for NeurIPS paper: Robust, Accurate Stochastic Optimization for Variational Inference

Neural Information Processing Systems

Summary and Contributions: In this paper, the authors study stochastic optimization algorithms for variational inference. In particular, they argue that existing stochastic optimization techniques for variational inference are fragile with respect to the hyperparameters of the optimization algorithm, and that the standard stopping rule for stochastic optimization in variational inference is insufficient. The authors view the SGD algorithm with the ELBO objective as a Markov chain with a stationary distribution centered around the true variational posterior. The main contribution of this paper is (a) the use of iterate averaging to determine the parameters of the variational posterior.
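To make the Markov-chain view concrete, one simple convergence check (an assumed illustration, not necessarily the paper's diagnostic) is a split-R-hat statistic computed on a window of SGD iterates: if the first and second halves of the window look like draws from the same stationary distribution, the optimizer has plausibly reached its noise ball and iterate averaging can begin.

```python
import numpy as np

def split_rhat(iterates):
    # Split-R-hat for a 1-D trace of one variational parameter's SGD iterates.
    # Values near 1 suggest the trace looks stationary; values well above 1
    # suggest the optimizer is still in its transient phase.
    x = np.asarray(iterates, dtype=float)
    n = len(x) // 2
    halves = np.stack([x[:n], x[n:2 * n]])            # treat the two halves as two "chains"
    within = halves.var(axis=1, ddof=1).mean()        # W: average within-half variance
    between = n * halves.mean(axis=1).var(ddof=1)     # B: n * variance of the half means
    var_plus = (n - 1) / n * within + between / n     # pooled variance estimate
    return float(np.sqrt(var_plus / within))
```

A stopping rule built along these lines might, for example, declare the transient phase over only once split-R-hat falls below a threshold such as 1.1 for every parameter trace, and only then start accumulating the iterate average.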


Review for NeurIPS paper: Robust, Accurate Stochastic Optimization for Variational Inference

Neural Information Processing Systems

The reviewers have pointed out a variety of areas where the paper can be improved. I feel that the authors can address these points when revising their manuscript for the camera-ready version, especially by adding a discussion of how their work ties into Dieuleveut, A., Durmus, A. & Bach, F. (2020). I also encourage them to address the reviewers' other concerns.

